8 research outputs found

    High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation

    The task of image-level weakly-supervised semantic segmentation (WSSS) has gained popularity in recent years, as it reduces the vast data annotation cost for training segmentation models. The typical approach for WSSS involves training an image classification network using global average pooling (GAP) on convolutional feature maps. This enables the estimation of object locations based on class activation maps (CAMs), which identify the importance of image regions. The CAMs are then used to generate pseudo-labels, in the form of segmentation masks, to supervise a segmentation model in the absence of pixel-level ground truth. In the case of the SEAM baseline, a previous work proposed to improve CAM learning in two ways: (1) importance sampling, which is a substitute for GAP, and (2) the feature similarity loss, which utilizes a heuristic that object contours almost exclusively align with color edges in images. In this work, we propose a different probabilistic interpretation of CAMs for these techniques, rendering the likelihood more appropriate than the multinomial posterior. As a result, we propose an add-on method that can boost essentially any previous WSSS method, improving both the region similarity and contour quality of all implemented state-of-the-art baselines. This is demonstrated on a wide variety of baselines on the PASCAL VOC dataset. Experiments on the MS COCO dataset show that performance gains can also be achieved in a large-scale setting. Our code is available at https://github.com/arvijj/hfpl
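
The GAP-and-CAM pipeline described in this abstract can be illustrated with a minimal sketch. It is not taken from the linked repository: the function names, the nested-list data layout, and the fixed threshold for pseudo-labels are all assumptions made for illustration. Real pipelines usually normalise and refine the CAMs before thresholding.

```python
# Illustrative sketch only: names, shapes and the threshold rule are assumptions.

def class_activation_maps(features, weights):
    """features: K feature maps of size H x W; weights: C rows of K classifier weights.
    Returns C maps with cam[c][y][x] = sum_k weights[c][k] * features[k][y][x]."""
    K = len(features)
    H, W = len(features[0]), len(features[0][0])
    return [
        [[sum(w_c[k] * features[k][y][x] for k in range(K)) for x in range(W)]
         for y in range(H)]
        for w_c in weights
    ]


def gap_logits(cams):
    """Image-level class scores under GAP: the spatial mean of each CAM."""
    return [sum(sum(row) for row in cam) / (len(cam) * len(cam[0])) for cam in cams]


def pseudo_labels(cams, threshold):
    """Per-pixel argmax over classes; pixels whose best score is below
    `threshold` (a hypothetical cut-off) are marked -1 (background/ignore)."""
    H, W = len(cams[0]), len(cams[0][0])
    labels = []
    for y in range(H):
        row = []
        for x in range(W):
            scores = [cam[y][x] for cam in cams]
            best = max(range(len(scores)), key=scores.__getitem__)
            row.append(best if scores[best] >= threshold else -1)
        labels.append(row)
    return labels
```

Because the classifier is linear, averaging the feature maps and then applying the weights gives the same image-level score as averaging each CAM, which is why the CAMs can be read off a network trained only with image-level labels.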

    End-to-end Reinforcement Learning for Online Coverage Path Planning in Unknown Environments

    Coverage path planning is the problem of finding the shortest path that covers the entire free space of a given confined area, with applications ranging from robotic lawn mowing and vacuum cleaning to demining and search-and-rescue tasks. While offline methods can find provably complete, and in some cases optimal, paths for known environments, their value is limited in online scenarios where the environment is not known beforehand, especially in the presence of non-static obstacles. We propose an end-to-end reinforcement learning-based approach in continuous state and action space for the online coverage path planning problem that can handle unknown environments. We construct the observation space from both global maps and local sensory inputs, allowing the agent to plan a long-term path and simultaneously act on short-term obstacle detections. To account for large-scale environments, we propose to use a multi-scale map input representation. Furthermore, we propose a novel total variation reward term for eliminating thin strips of uncovered space in the learned path. To validate the effectiveness of our approach, we perform extensive experiments in simulation with a distance sensor, surpassing the performance of a recent reinforcement learning-based approach.
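
To see why a total variation term discourages thin uncovered strips, a minimal sketch follows. The abstract does not give the exact reward, so the anisotropic definition, the binary coverage grid, and the hypothetical `tv_reward` helper with its `scale` parameter are assumptions.

```python
# Illustrative sketch only: the paper's exact reward formulation is not given
# in the abstract; this uses a plain anisotropic total variation on a grid.

def total_variation(coverage):
    """Sum of absolute differences between horizontally and vertically
    adjacent cells of a 2D grid (1 = covered, 0 = uncovered)."""
    H, W = len(coverage), len(coverage[0])
    tv = 0
    for y in range(H):
        for x in range(W):
            if x + 1 < W:
                tv += abs(coverage[y][x + 1] - coverage[y][x])
            if y + 1 < H:
                tv += abs(coverage[y + 1][x] - coverage[y][x])
    return tv


def tv_reward(prev_coverage, new_coverage, scale=1.0):
    """Hypothetical reward term: positive when a step lowers the total variation."""
    return scale * (total_variation(prev_coverage) - total_variation(new_coverage))
```

On a 3 x 3 grid, a one-cell-wide uncovered strip down the middle has total variation 6, whereas the same number of uncovered cells along one edge has total variation 3, so a coverage pattern that leaves thin strips is penalised more heavily.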

    Camera-Based Friction Estimation with Deep Convolutional Neural Networks

    During recent years, great progress has been made within the field of deep learning, and more specifically, within neural networks. Deep convolutional neural networks (CNNs) have been especially successful within image processing in tasks such as image classification and object detection. Car manufacturers, amongst other actors, are starting to realize the potential of deep learning and have begun applying it to autonomous driving. This is not a simple task, and many challenges still lie ahead. A sub-problem that needs to be solved is automatically determining the road conditions, including the friction. Since many modern cars are equipped with cameras these days, it is only natural to approach this problem with CNNs. This is what has been done in this thesis. First, a data set is gathered which consists of 37,000 labeled road images that are taken through the front window of a car. Second, CNNs are trained on this data set to classify the friction of a given road. Gathering road images and labeling them with the correct friction is a time-consuming and difficult process, and requires human supervision. For this reason, experiments are made on a second data set, which consists of 54,000 simulated images. These images are captured from the racing game World Rally Championship 7 and are used in addition to the real images, to investigate what can be gained from this. Experiments conducted during this thesis show that CNNs are a good approach for the problem of estimating the road friction. The limiting factor, however, is the data set. Not only does the data set need to be much bigger, but it also has to include a much wider variety of driving conditions. Friction is a complex property and depends on many variables, and CNNs are only effective on the type of data that they have been trained on. For these reasons, new data has to be gathered by actively seeking different driving conditions in order for this approach to be deployable in practice.

    Importance Sampling CAMs for Weakly-Supervised Segmentation with Highly Accurate Contours

    Classification networks have been used in weakly-supervised semantic segmentation (WSSS) to segment objects by means of class activation maps (CAMs). However, without pixel-level annotations, they are known to (1) mainly focus on discriminative regions, and (2) produce diffuse CAMs without well-defined prediction contours. In this work, we alleviate both problems by improving CAM learning. First, we incorporate importance sampling based on the class-wise probability mass function induced by the CAMs to produce stochastic image-level class predictions. This results in segmentations that cover a larger extent of the objects, as shown in our empirical studies. Second, we formulate a feature similarity loss term, which further improves the alignment of predicted contours with edges in the image. Furthermore, we shed new light onto the problem of WSSS by measuring the contour F-score as a complement to the common area mIoU metric. We show that our method significantly outperforms previous methods in terms of contour quality, while matching state-of-the-art on region similarity. Comment: Additional experiments/result
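
The importance sampling step can be sketched as follows, assuming non-negative activations. The helper names, the uniform fallback for an all-zero map, and the use of the sampled activation as the image-level score are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a stochastic image-level score drawn from the
# probability mass function that one class's CAM induces over pixel positions.
import random


def cam_to_pmf(cam):
    """Normalise a non-negative H x W activation map into a pmf over its
    flattened pixel positions (uniform if the map is all zeros)."""
    flat = [v for row in cam for v in row]
    total = sum(flat)
    if total <= 0:
        return [1 / len(flat)] * len(flat)
    return [v / total for v in flat]


def sample_score(cam, rng):
    """Draw one pixel with probability proportional to its activation and
    return (flat pixel index, activation at that pixel)."""
    flat = [v for row in cam for v in row]
    idx = rng.choices(range(len(flat)), weights=cam_to_pmf(cam), k=1)[0]
    return idx, flat[idx]
```

Unlike a spatial mean, which is dominated by the many low-activation pixels, a sampled score can come from any pixel with non-zero mass, so training pressure is not confined to the single most discriminative region.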

    Balanced Product of Experts for Long-Tailed Recognition

    Many real-world recognition problems suffer from an imbalanced or long-tailed label distribution. Those distributions make representation learning more challenging due to limited generalization over the tail classes. If the test distribution differs from the training distribution, e.g. uniform versus long-tailed, the problem of the distribution shift needs to be addressed. To this end, recent works have extended softmax cross-entropy using margin modifications, inspired by Bayes' theorem. In this paper, we generalize several approaches with a Balanced Product of Experts (BalPoE), which combines a family of models with different test-time target distributions to tackle the imbalance in the data. The proposed experts are trained in a single stage, either jointly or independently, and fused seamlessly into a BalPoE. We show that BalPoE is Fisher consistent for minimizing the balanced error and perform extensive experiments to validate the effectiveness of our approach. Finally, we investigate the effect of Mixup in this setting, discovering that regularization is a key ingredient for learning calibrated experts. Our experiments show that a regularized BalPoE can perform remarkably well in test accuracy and calibration metrics, leading to state-of-the-art results on CIFAR-100-LT, ImageNet-LT, and iNaturalist-2018 datasets. The code will be made publicly available upon paper acceptance. Comment: 19 pages, under revie
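
The Bayes-inspired margin idea and the fusion of experts can be illustrated with a generic sketch. This is not the BalPoE training procedure: the post-hoc shift by `tau * log(prior)`, the function names, and the fusion by averaging log-probabilities are assumptions chosen to show the principle.

```python
# Illustrative sketch only: generic logit adjustment and a product of experts.
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def adjust_logits(logits, class_priors, tau):
    """Subtract tau * log(prior) per class; tau = 1 removes the training
    prior and so targets a uniform test distribution."""
    return [z - tau * math.log(p) for z, p in zip(logits, class_priors)]


def product_of_experts(expert_logits):
    """Average the experts' log-probabilities (a geometric mean of their
    distributions) and renormalise into a probability vector."""
    logps = [log_softmax(l) for l in expert_logits]
    n = len(logps)
    avg = [sum(lp[c] for lp in logps) / n for c in range(len(logps[0]))]
    return [math.exp(v) for v in log_softmax(avg)]
```

In this toy setting, fusing one expert with `tau = 0` and one with `tau = 2` gives the same prediction as a single expert with `tau = 1`, which shows how experts aimed at different target distributions can average out to a balanced one.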